Efficient Computation of PageRank
Abstract
This paper discusses efficient techniques for computing PageRank, a ranking metric for hypertext documents. We show that PageRank can be computed for very large subgraphs of the web (up to hundreds of millions of nodes) on machines with limited main memory. Running-time measurements on various memory configurations are presented for PageRank computation over the 24-million-page Stanford WebBase archive. We discuss several methods for analyzing the convergence of PageRank based on the induced ordering of the pages. We present convergence results helpful for determining the number of iterations necessary to achieve a useful PageRank assignment, both in the absence and presence of search queries.
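As a rough illustration of the computation the abstract refers to, the following is a minimal in-memory power-iteration sketch. It is not the paper's limited-memory technique. The function names `pagerank` and `ordering`, the damping factor of 0.85, the tolerance, and the toy graph in the usage note are all assumptions made for this example.

```python
def pagerank(out_links, damping=0.85, tol=1e-8, max_iter=200):
    """Power iteration over an adjacency-list graph {node: [successors]}.

    Returns (rank, iterations). All parameter defaults are illustrative
    assumptions, not values taken from the paper. Expects a non-empty graph.
    """
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            if targets:
                # Each page splits its damped rank evenly among its out-links.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        # L1 distance between successive iterates.
        residual = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if residual < tol:
            break
    return rank, iteration


def ordering(rank):
    """Pages sorted by decreasing rank, ties broken by name."""
    return sorted(rank, key=lambda u: (-rank[u], u))
```

For a hypothetical four-page graph, `pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]})` returns the rank dictionary and the number of iterations used. Passing the ranks to `ordering` gives the induced ordering of pages. Comparing that ordering between successive iterations, rather than only the numeric residual, is one way to approximate the ordering-based convergence analysis the abstract mentions.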
Similar resources
Web-Site-Based Partitioning Techniques for Efficient Parallelization of the PageRank Computation
The efficiency of the PageRank computation is important since the constantly evolving nature of the Web requires this computation to be repeated many times. PageRank computation includes repeated iterative sparse matrix-vector multiplications. Due to the enormous size of the Web matrix to be multiplied, PageRank computations are usually carried out on parallel systems. Graph and hypergraph par...
PageRank Computation Using PC Cluster
Link-based analysis of web graphs has been extensively explored in many research projects. PageRank computation is one widely known approach, which forms the basis of the Google search. PageRank assigns a global importance score to a web page based on the importance of other web pages pointing to it. PageRank is an iterative algorithm applied to a massively connected graph corresponding to seve...
An Overview of Efficient Computation of PageRank
With the rapid growth of the Web, users get easily lost in the rich hyper structure. Providing relevant information to the users to cater to their needs is the primary goal of website owners. Therefore, finding the content of the Web and retrieving the users’ interests and needs from their behavior have become increasingly important. Web mining is used to categorize users and pages by analyzing...
Jordan Canonical Form of the Google Matrix: A Potential Contribution to the PageRank Computation
We consider the web hyperlink matrix used by Google for computing the PageRank whose form is given by $A(c) = [cP + (1 - c)E]^T$, where $P$ is a row stochastic matrix, $E$ is a row stochastic rank one matrix, and $c \in [0, 1]$. We determine the analytic expression of the Jordan form of $A(c)$ and, in particular, a rational formula for the PageRank in terms of $c$. The use of extrapolation procedures is very...
Efficient Parallel Computation of PageRank
PageRank is inherently massively parallelizable and distributable, as a result of the web's strict host-based link locality. We show that the Gauß-Seidel iterative method can actually be applied in such a parallel ranking scenario in order to improve convergence. By introducing a two-dimensional web model and by adapting the PageRank to this environment, we present efficient methods to compute the ...